On Feature Subset Selection for Fuzzy and Classic Machine Learning Classification Methods

Authors

  • Mariana V. Ribeiro
  • Heloisa A. Camargo
  • Marcos E. Cintra
Abstract

Feature subset selection supports the classification task by reducing the search space and by removing irrelevant and random features, which might compromise the resulting classification model. Decision trees perform an embedded feature selection, as they select only the relevant features for splitting the datasets during the induction process. FUZZYDT is a fuzzy decision tree that uses entropy and information gain in its induction process. Its main advantage over classic decision tree algorithms is the transformation of the attributes into fuzzy linguistic attributes, which adds interpretability to the induced models and allows imprecision and uncertainty to be processed through fuzzy set and fuzzy logic theories. Filters are also widely used, as they have low computational cost and can be applied as a preprocessing step. The large differences among the available approaches for feature selection motivated us to empirically test some methods specifically for fuzzy classification systems. Our initial hypothesis was that FUZZYDT would yield better results for fuzzy classification systems than other methods, because such fuzzy systems and FUZZYDT share the definition of the attributes in terms of fuzzy sets. The experiments showed that the CFS filter produced better results than the other filters, C4.5, and FUZZYDT. These results, although contrary to our hypothesis, are relevant for our research with fuzzy systems, especially genetic fuzzy systems, which have a high computational cost, because CFS is simple and computationally cheap.

Keywords: Feature subset selection, fuzzy logic, fuzzy classification systems, decision trees.
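As an illustration of the entropy and information gain criterion mentioned in the abstract, the following is a minimal sketch of ranking features by information gain on crisp, categorical data. It is not the FUZZYDT algorithm, which works on fuzzified attributes, nor the CFS filter. The function names (`entropy`, `information_gain`, `rank_features`) and the toy dataset are assumptions made for this example only.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(feature_values, labels):
    """Entropy reduction obtained by splitting `labels` on a categorical feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    # Weighted average entropy of the partitions induced by the feature.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def rank_features(columns, labels):
    """Return (feature name, gain) pairs sorted from most to least informative."""
    gains = [(name, information_gain(values, labels)) for name, values in columns.items()]
    return sorted(gains, key=lambda pair: pair[1], reverse=True)


# Hypothetical toy data: one feature that separates the classes, one that does not.
toy_labels = ["a", "a", "b", "b"]
toy_columns = {
    "f_uninformative": ["x", "y", "x", "y"],
    "f_informative": ["x", "x", "y", "y"],
}

for name, gain in rank_features(toy_columns, toy_labels):
    print(f"{name}: {gain:.3f}")
```

A tree inducer would split on the top-ranked feature and repeat on each partition; features that are never chosen for a split are, in effect, filtered out, which is the embedded selection the abstract refers to.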


Similar resources

A Classification Method for E-mail Spam Using a Hybrid Approach for Feature Selection Optimization

Spam is unwanted email that harms communications around the world. It is a growing problem for personal email, so detecting it is essential. Machine learning is well suited to this problem because its adaptive nature lets it learn the patterns required for classification. Nonetheless, in spam detection, there are a large num...


Feature Selection Using Multi Objective Genetic Algorithm with Support Vector Machine

Different approaches have been proposed for feature selection to obtain a suitable feature subset from among all features. These methods search the feature space for feature subsets that satisfy some criteria or optimize several objective functions. The objective functions are divided into two main groups: filter and wrapper methods. In filter methods, feature subsets are selected due to some measu...


Online Streaming Feature Selection Using Geometric Series of the Adjacency Matrix of Features

Feature Selection (FS) is an important pre-processing step in machine learning and data mining. All the traditional feature selection methods assume that the entire feature space is available from the beginning. However, online streaming features (OSF) are an integral part of many real-world applications. In OSF, the number of training examples is fixed while the number of features grows with t...


Mental Arithmetic Task Recognition Using Effective Connectivity and Hierarchical Feature Selection From EEG Signals

Introduction: Mental arithmetic analysis based on electroencephalogram (EEG) signals for monitoring the state of the user’s brain functioning can help in understanding some psychological disorders, such as attention deficit hyperactivity disorder, autism spectrum disorder, or dyscalculia, in which there is difficulty in learning or understanding arithmetic. Most mental arithmetic recogni...


Modeling and design of a diagnostic and screening algorithm based on hybrid feature selection-enabled linear support vector machine classification

Background: In the current study, a hybrid feature selection approach involving filter and wrapper methods is applied to some bioscience databases with various records, attributes, and classes; this strategy therefore enjoys the advantages of both methods, such as fast execution, generality, and accuracy. The purpose is to diagnose disease status and estimate patient survival. Method...




Publication date: 2014